An Open source Implementation of ITU-T Recommendation P.808 with Validation

Author: Cutler Ross
Naderi Babak
Publication venue: 'International Speech Communication Association'
Publication date: 16/05/2020

The ITU-T Recommendation P.808 provides a crowdsourcing approach for conducting a subjective assessment of speech quality using the Absolute Category Rating (ACR) method. We provide an open-source implementation of the ITU-T Rec. P.808 that runs on the Amazon Mechanical Turk platform. We extended our implementation to include Degradation Category Ratings (DCR) and Comparison Category Ratings (CCR) test methods. We also significantly speed up the test process by integrating the participant qualification step into the main rating task, compared to a two-stage qualification and rating solution. We provide scripts for creating and executing the subjective test, and for cleansing the data and analyzing the answers, to avoid operational errors. To validate the implementation, we compare the Mean Opinion Scores (MOS) collected through our implementation with MOS values from a standard laboratory experiment conducted based on the ITU-T Rec. P.800. We also evaluate the reproducibility of the result of the subjective speech quality assessment through crowdsourcing using our implementation. Finally, we quantify the impact of parts of the system designed to improve the reliability: environmental tests, gold and trapping questions, rating patterns, and a headset usage test.

arXiv.org e-Print Archive

Crossref

Macular Bioaccelerometers on Earth and in Space

Author: Cutler L.
Lam T.
Meyer G.
Ross M. D.
Vazin P.

Space flight offers the opportunity to study linear bioaccelerometers (vestibular maculas) in the virtual absence of a primary stimulus, gravitational acceleration. Macular research in space is particularly important to NASA because the bioaccelerometers are proving to be weighted neural networks in which information is distributed for parallel processing. Neural networks are plastic and highly adaptive to new environments. Combined morphological-physiological studies of maculas fixed in space and following flight should reveal macular adaptive responses to microgravity, and their time-course. Ground-based research, already begun, using computer-assisted, 3-dimensional reconstruction of macular terminal fields will lead to development of computer models of functioning maculas. This research should continue in conjunction with physiological studies, including work with multichannel electrodes. The results of such a combined effort could usher in a new era in understanding vestibular function on Earth and in space. They can also provide a rational basis for counter-measures to space motion sickness, which may prove troublesome as space voyagers encounter new gravitational fields on planets, or must re-adapt to 1 g upon return to Earth.

NASA Technical Reports Server

Trustworthy Experimentation Under Telemetry Loss

Author: Cutler Ross
Dmitriev Pavel
Ellis Martin
Gupchup Jayant
Hosseinkashi Yasaman
Jefremov Andrei
Schneider Daniel
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 21/01/2019

Failure to accurately measure the outcomes of an experiment can lead to bias and incorrect conclusions. Online controlled experiments (also known as A/B tests) are increasingly being used to make decisions to improve websites as well as mobile and desktop applications. We argue that loss of telemetry data (during upload or post-processing) can skew the results of experiments, leading to loss of statistical power and inaccurate or erroneous conclusions. By systematically investigating the causes of telemetry loss, we argue that it is not practical to entirely eliminate it. Consequently, experimentation systems need to be robust to its effects. Furthermore, we note that it is nontrivial to measure the absolute level of telemetry loss in an experimentation system. In this paper, we take a top-down approach towards solving this problem. We motivate the impact of loss qualitatively using experiments in real applications deployed at scale, and formalize the problem by presenting a theoretical breakdown of the bias introduced by loss. Based on this foundation, we present a general framework for quantitatively evaluating the impact of telemetry loss, and present two solutions to measure the absolute levels of loss. This framework is used by well-known applications at Microsoft, with millions of users and billions of sessions. These general principles can be adopted by any application to improve the overall trustworthiness of experimentation and data-driven decision making. Comment: Proceedings of the 27th ACM International Conference on Information and Knowledge Management, October 201

arXiv.org e-Print Archive

Crossref

Multi-dimensional Speech Quality Assessment in Crowdsourcing

Author: Cutler Ross
Naderi Babak
Ristea Nicolae-Catalin
Publication date: 13/09/2023

Subjective speech quality assessment is the gold standard for evaluating speech enhancement processing and telecommunication systems. The commonly used standard ITU-T Rec. P.800 defines how to measure speech quality in lab environments, and ITU-T Rec. P.808 extended it for crowdsourcing. ITU-T Rec. P.835 extends P.800 to measure the quality of speech in the presence of noise. ITU-T Rec. P.804 targets the conversation test and introduces perceptual speech quality dimensions which are measured during the listening phase of the conversation. The perceptual dimensions are noisiness, coloration, discontinuity, and loudness. We create a crowdsourcing implementation of a multi-dimensional subjective test following the scales from P.804 and extend it to include reverberation, the speech signal, and overall quality. We show the tool is both accurate and reproducible. The tool has been used in the ICASSP 2023 Speech Signal Improvement challenge and we show the utility of these speech quality dimensions in this challenge. The tool will be publicly available as open-source at https://github.com/microsoft/P.808

arXiv.org e-Print Archive

VCD: A Video Conferencing Dataset for Video Compression

Author: Cutler Ross
Hosseinkashi Yasaman
Khongbantabam Nabakumar Singh
Naderi Babak
Publication date: 13/09/2023

Commonly used datasets for evaluating video codecs are all very high quality and not representative of video typically used in video conferencing scenarios. We present the Video Conferencing Dataset (VCD) for evaluating video codecs for real-time communication, the first such dataset focused on video conferencing. VCD includes a wide variety of camera qualities and spatial and temporal information. It includes both desktop and mobile scenarios and two types of video background processing. We report the compression efficiency of H.264, H.265, H.266, and AV1 in low-delay settings on VCD and compare it with the non-video conferencing datasets UVC, MLC-JVC, and HEVC. The results show the source quality and the scenarios have a significant effect on the compression efficiency of all the codecs. VCD enables the evaluation and tuning of codecs for this important scenario. The VCD is publicly available as an open-source dataset at https://github.com/microsoft/VCD

arXiv.org e-Print Archive

Real-time Bandwidth Estimation from Offline Expert Demonstrations

Author: Cutler Ross
Gopal Vishak
Gottipati Aashish
Khairy Sami
Mittag Gabriel
Publication date: 23/09/2023

In this work, we tackle the problem of bandwidth estimation (BWE) for real-time communication systems; however, in contrast to previous works, we leverage the vast efforts of prior heuristic-based BWE methods and synergize these approaches with deep learning-based techniques. Our work addresses challenges in generalizing to unseen network dynamics and extracting rich representations from prior experience, two key challenges in integrating data-driven bandwidth estimators into real-time systems. To that end, we propose Merlin, the first purely offline, data-driven solution to BWE that harnesses prior heuristic-based methods to extract an expert BWE policy. Through a series of experiments, we demonstrate that Merlin surpasses state-of-the-art heuristic-based and deep learning-based bandwidth estimators in terms of objective quality of experience metrics while generalizing beyond the offline world to in-the-wild network deployments, where Merlin achieves a 42.85% and 12.8% reduction in packet loss and delay, respectively, when compared against WebRTC in inter-continental videoconferencing calls. We hope that Merlin's offline-oriented design fosters new strategies for real-time network control.

arXiv.org e-Print Archive

Meeting effectiveness and inclusiveness: large-scale measurement, identification of key features, and prediction in real-world remote meetings

Author: Cutler Ross
Hosseinkashi Yasaman
Madan Chinmaya
Pool Jamie
Tankelevitch Lev
Publication date: 07/10/2023

Workplace meetings are vital to organizational collaboration, yet relatively little progress has been made toward measuring meeting effectiveness and inclusiveness at scale. The recent rise in remote and hybrid meetings represents an opportunity to do so via computer-mediated communication (CMC) systems. Here, we share the results of an effective and inclusive meetings survey embedded within a CMC system in a diverse set of companies and organizations. We correlate the survey results with objective metrics available from the CMC system to identify the generalizable attributes that characterize perceived effectiveness and inclusiveness in meetings. Additionally, we explore a predictive model of meeting effectiveness and inclusiveness based solely on objective meeting attributes. Lastly, we show challenges and discuss solutions around the subjective measurement of meeting experiences. To our knowledge, this is the largest data-driven study conducted after the pandemic peak to measure, understand, and predict effectiveness and inclusiveness in real-world meetings at an organizational scale.

arXiv.org e-Print Archive